Issues in building general letter to sound rules

Authors

  • Alan W. Black
  • Kevin A. Lenzo
  • Vincent Pagel

Abstract

In general text-to-speech systems, it is not possible to guarantee that a lexicon will contain all words found in a text; some system for predicting pronunciation from the word itself is therefore necessary. Here we present a general framework for building letter-to-sound (LTS) rules from a word list in a language. The technique can be fully automatic, though a small amount of hand seeding can give better results. We have applied this technique to English (UK and US), French, and German. The generated models achieve 75% (UK English), 58% (US English), 93% (French), and 89% (German) words correct on held-out data from the word lists. To test our models on more typical data, we also analyzed general text to find which words do not appear in our lexicon. These unknown words were used as a more realistic test corpus for our models. We also discuss the distribution and type of such unknown words.
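
The two evaluation ideas in the abstract, words correct on held-out data and collecting words from general text that the lexicon lacks, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `words_correct` and `unknown_words`, the toy lexicon, the phone symbols, and the one-phone-per-letter `naive_predict` stand-in are hypothetical and are not the authors' models or data.

```python
import re

def words_correct(predict, held_out):
    """Fraction of held-out words whose entire predicted phone string matches."""
    hits = sum(1 for word, phones in held_out.items() if predict(word) == phones)
    return hits / len(held_out)

def unknown_words(text, lexicon):
    """Distinct lower-cased alphabetic tokens in `text` that `lexicon` lacks."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted({t for t in tokens if t not in lexicon})

# Toy data, invented purely for illustration (not from the paper).
lexicon = {"cat": ["k", "ae", "t"], "dog": ["d", "ao", "g"]}
held_out = {
    "cat": ["k", "ae", "t"],
    "dog": ["d", "ao", "g"],
    "cot": ["k", "aa", "t"],
}

# Hypothetical stand-in predictor: one phone per letter from a fixed table.
LETTER_TO_PHONE = {"c": "k", "a": "ae", "t": "t", "d": "d", "o": "ao", "g": "g"}

def naive_predict(word):
    return [LETTER_TO_PHONE[ch] for ch in word]

accuracy = words_correct(naive_predict, held_out)
oov = unknown_words("The cat saw a dog and a zebra.", lexicon)
print(f"words correct: {accuracy:.2%}")
print("unknown words:", oov)
```

Note that the metric counts a word as correct only when every phone matches, so it is stricter than a per-letter or per-phone score.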

Related articles

Performance Comparison between Human Engineered & Machine Learned Letter-to-sound Rules for English: a Machine Learning Success Story

The task of mapping spelled English words into strings of phonemes and stresses ("reading aloud") has many practical applications. Several commercial systems perform this task by applying a knowledge base of expert-supplied letter-to-sound rules. This paper presents a set of machine learning methods for automatically constructing letter-to-sound rules by analyzing a dictionary of words and thei...

Dialect variation in Boro Language and Grapheme-to-Phoneme conversion rules to handle lexical lookup fails in Boro TTS System

It is not possible to include all the words of a natural language in a general text-to-speech system. A grapheme-to-phoneme conversion system is essential for pronouncing a word that is out of vocabulary. Grapheme-to-phoneme rules play a vital role where lexical lookup fails. Though a basic grapheme-to-phoneme rule system is very simple, it is very powerful for the naturalness of a TTS system. Letter-to...

Effect of sound classification by neural networks in the recognition of human hearing

In this paper, we focus on two basic issues: (a) the classification of sound by neural networks based on frequency and sound intensity parameters, and (b) evaluating the health of different human ears as compared to those of a healthy person. Sound classification by a specific feedforward neural network with two inputs, frequency and sound intensity, and two hidden layers is proposed. This process...

Welsh letter-to-sound rules: rewrite rules and two-level rules compared

In a text-to-speech synthesis system, input words not found in the system's lexicon are passed to letter-to-sound rules, which derive the word's pronunciation. In Welsh, the letter-to-sound rules must be applied in three passes: firstly, to add epenthetic vowels, secondly, to determine stress and vowel location, and thirdly, to perform grapheme-to-phoneme conversion. To begin with, all these le...

Pathology of Legislation and Legal Approvals Related to Texture and Building Management

Today, the importance and special position of historical texture and buildings in the city is completely obvious. However, the issue of protecting, supporting and restoring historical texture and valuable historical buildings in Iran is not pursued seriously, scientifically and legally, either by top-level management or by popular demand. Historical texture and buildings are increasingly in danger...

Publication date: 1998